(57) Abstract



Messages (20) are routed through an array 
(10) of data processing nodes (N1-N16) which 
are intercoupled with channels in rows and 
columns. Under certain conditions (not state 30 
and not state 31a of Fig. 4), a message can exit 
a node in either one of two directions; and this 
enables the message to reach its destination by 
multiple routes. Under other conditions (state 
30 or state 31a of Fig. 4), the message must 
exit the node in only one predetermined direction,
and that direction is selected to avoid message 
routing deadlocks.

MULTI-PATH MESSAGE ROUTING WITHOUT DEADLOCKS

BACKGROUND OF THE INVENTION

This invention relates to the field of data 
processing; and more particularly, it relates to methods of 
routing messages through an array of data processing nodes 
such that multiple paths can be taken to reach a
destination without causing message routing deadlocks to 
occur. 

As used herein, the term data processing node is
meant to include the combination of at least the following
items: a microprocessor chip, a memory coupled to the
microprocessor, and input-output channels to and from the
microprocessor. Here, the microprocessor, memory, and
input-output channels can have any internal make-up.

WO 95/30192    PCT/US95/05334

Such a data processing node has use by itself in
that the memory can store a program for the microprocessor
to execute, and data can be sent to and received from the
data processing node via the input-output channels.
However, by intercoupling multiple data processing nodes
together in an array via their input-output channels,
several advantages over a single data processing node are
achieved.

One advantage is that an array of nodes provides 
a selectable or scalable amount of computing power. To
increase/decrease the computing power of the array, some 
nodes are simply added to/deleted from the array. 

Another advantage is that the array of
nodes provides computing power which is fail-soft. This
means that one or more nodes can fail and be in need of
repair, while the remaining nodes in the array continue to 
operate. 

However, in any array of data processing nodes, 
an issue that needs to be addressed is how to route
information in the form of messages from one node to
another node. Such message routing is of course needed in 
order for the nodes of the array to work on data processing 
problems in a coordinated and cooperative fashion. 

Presently in the art, Intel Corporation sells a
scalable parallel processor, called the "Paragon", which
comprises an array of data processing nodes that are
intercoupled with channels as a "mesh". Within this mesh,
the data processing nodes are arranged in rows and columns;
and messages are passed from node to node along those rows
and columns.

However, a major drawback with the above scalable 
parallel processor is that the route which each message 
takes from its source to its destination is fixed. 
Consequently, whenever the route for a message is blocked
because it requires a channel that is busy, that message
must wait for the channel to become available. Further, if 
the route for a message is blocked by a broken channel, 
that message will not reach its destination until the 
channel break is fixed. 

Accordingly, a primary object of the invention is
to provide an improved method of routing messages through
an array of data processing nodes whereby the above
drawbacks are overcome.

BRIEF SUMMARY OF THE INVENTION

With the present invention, messages are routed
through an array of data processing nodes which are
intercoupled with channels in rows and columns. Each
message includes a header with an S_X field which selects a +X
or -X direction for the message to travel on the rows of
channels, a ΔX field which specifies the number of nodes
through which the message must pass in the direction
selected by the S_X field, an S_Y field which selects a +Y or
-Y direction for the message to travel on the columns of
channels, and a ΔY field which specifies the number of
nodes through which the message must pass in the direction
selected by the S_Y field. When a message reaches a node
through which it must pass, the header fields are examined
to determine if ΔX≠0 and ΔY≠0. If that condition exists,
then for two combinations of the S_X and S_Y fields, the
message is passed through the node in either the direction
selected by S_X or the direction selected by S_Y. For the
remaining two combinations of the S_X and S_Y fields, the
message is passed through the node in a predetermined
direction which is chosen such that the above variable
message routing does not result in any message routing
deadlock. How message routing deadlocks can occur by the
variable routing, and how they are prevented, are explained
in detail herein in conjunction with Figs. 3A, 3B, and 4.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 shows an array of data processing nodes
which are intercoupled with channels in rows and columns.

Fig. 2 shows the array of Fig. 1 together with a
message format and several paths along which messages are
routed in accordance with the present invention.

Figs. 3A and 3B are schematic diagrams which
illustrate the occurrence of two types of message routing
deadlocks in the array of Fig. 1.

Fig. 4 lists twelve alternative pairs of routing
limitations by which the message routing deadlocks of Figs.
3A and 3B are prevented.

Fig. 5 shows a circuit which constitutes one
preferred embodiment of each of the data processing nodes
in the Fig. 1 array and by which the routing limitations of
Fig. 4 are imposed.

Fig. 6 shows additional details of the internal
structure of a control module which lies within the Fig. 5
data processing node.

DETAILED DESCRIPTION

Referring now to Fig. 1, it shows an array (or
mesh) 10 of data processing nodes N1-N16 through which
messages are routed in accordance with the present
invention. To pass those messages from one node to
another, the nodes N1-N16 are intercoupled with full-duplex
channels CH(1,2), CH(2,3), etc. in rows and columns.

For example, node N1 is coupled to node N2 by a
full-duplex channel CH(1,2); node N2 is coupled to node N3
by a full-duplex channel CH(2,3); and node N3 is coupled to
node N4 by a full-duplex channel CH(3,4). Those nodes N1,
N2, N3 and N4 together with the channels CH(1,2), CH(2,3)
and CH(3,4) constitute one row in the array 10.

Likewise, node N1 is coupled to node N5 by a
full-duplex channel CH(1,5); node N5 is coupled to node N9
by a full-duplex channel CH(5,9); and node N9 is coupled to
node N13 by a full-duplex channel CH(9,13). Those nodes
N1, N5, N9 and N13 together with the channels CH(1,5),
CH(5,9) and CH(9,13) constitute one column in the array 10.

All of the nodes and full-duplex channels which
make up each of the rows and columns in the array 10 are
identified below in Table 1.

TABLE 1

ROW   COL.   Nodes and Channels
 1           N1, N2, N3, N4, CH(1,2), CH(2,3), CH(3,4)
 2           N5, N6, N7, N8, CH(5,6), CH(6,7), CH(7,8)
 3           N9, N10, N11, N12, CH(9,10), CH(10,11), CH(11,12)
 4           N13, N14, N15, N16, CH(13,14), CH(14,15), CH(15,16)
       1     N1, N5, N9, N13, CH(1,5), CH(5,9), CH(9,13)
       2     N2, N6, N10, N14, CH(2,6), CH(6,10), CH(10,14)
       3     N3, N7, N11, N15, CH(3,7), CH(7,11), CH(11,15)
       4     N4, N8, N12, N16, CH(4,8), CH(8,12), CH(12,16)



Each message which travels on a full-duplex
channel in any row can go in either a +X direction or a -X
direction; and those +X and -X directions are shown in Fig.
1. Similarly, each message which travels on a full-duplex
channel in any column can go in either a +Y direction or a
-Y direction; and those +Y and -Y directions are also shown
in Fig. 1.

For example, a message on channel CH(10,11) which
goes from node N10 to node N11 is traveling in the +X
direction; whereas a message on channel CH(10,11) which
goes from node N11 to node N10 is traveling in the -X
direction. Similarly, a message on channel CH(3,7) which
goes from node N7 to node N3 is traveling in the +Y
direction; whereas a message on channel CH(3,7) which goes
from node N3 to node N7 is traveling in the -Y direction.

Each message which travels on the channels in the
array 10 has a format 20 as shown in Fig. 2. That format
consists of two major parts, which are a header field 21
and a data field 22. In general, the data field 22
contains information which a first node (the source node)
is sending to a second node (the destination node); and the
header field 21 contains information which is used to route
the message from the source node to the destination node.
More specifically, the header field 21 includes
an S_X field, a ΔX field, an S_Y field, and a ΔY field. The S_X
field selects the +X direction or -X direction for the
message to travel on the rows of the array 10; and the ΔX
field specifies the number of nodes through which the
message must pass in the direction selected by the S_X field
in order to reach the destination node. Likewise, the S_Y
field selects the +Y direction or -Y direction for the
message to travel on the columns of the array 10; and the
ΔY field specifies the number of nodes through which the
message must pass in the direction selected by the S_Y field
in order to reach the destination node.

As an example of the above, consider the case
where node N9 sends a message to node N3. In that
particular case, if the header 21 leaves node N9 in the +X
direction, then that header will be as follows: S_X selects
the +X direction, ΔX=1, S_Y selects the +Y direction, and ΔY=2.
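
The header fields of this example can be reproduced with a short sketch. It is illustrative only and assumes the four-by-four numbering of Fig. 1, with N1 at the top-left corner, +X pointing toward higher node numbers in a row, and +Y pointing toward lower-numbered rows. The names `Header`, `position`, and `make_header` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass

COLS = 4  # assumed width of the Fig. 1 array (N1-N16, four nodes per row)


@dataclass
class Header:
    sx: str  # "+X" or "-X"
    dx: int  # remaining hops along the row (the ΔX field)
    sy: str  # "+Y" or "-Y"
    dy: int  # remaining hops along the column (the ΔY field)


def position(node: int) -> tuple[int, int]:
    """Return (column, row) of node Nn, with N1 at the top-left corner."""
    return (node - 1) % COLS, (node - 1) // COLS


def make_header(source: int, destination: int) -> Header:
    """Build the header a source node would attach for a destination node."""
    sc, sr = position(source)
    dc, dr = position(destination)
    # +X points toward higher node numbers in a row (N10 -> N11).
    # When a delta is zero the direction field is never consulted.
    sx = "+X" if dc >= sc else "-X"
    # +Y points toward lower-numbered rows (N7 -> N3).
    sy = "+Y" if dr <= sr else "-Y"
    return Header(sx, abs(dc - sc), sy, abs(dr - sr))


header = make_header(9, 3)
print(header)    # header as built inside the source node N9
header.dx -= 1   # leaving N9 on a row decrements the ΔX field
print(header)    # header as it leaves N9 in the +X direction
```

After the decrement, the fields agree with the values given in the text: +X with ΔX=1 and +Y with ΔY=2.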

Now in accordance with one feature of the present
invention, the above message will travel from node N10 to
node N3 along any one of several different paths. One such
path is indicated in Fig. 2 by reference numeral 23a; a
second path is indicated by reference numeral 23b; and a
third path is indicated by reference numeral 23c. Having
a choice of several paths to route a message is more
desirable than having just one path to route the message
because the one path could be blocked by the passage of
another message between another pair of nodes, or the one
path could be broken.
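
The count of three paths follows from the header values ΔX=1 and ΔY=2: there are three ways to order one row hop among two column hops. The sketch below enumerates them under the same assumed Fig. 1 numbering. The helper names are hypothetical, and the sketch does not claim which enumerated path corresponds to reference numeral 23a, 23b, or 23c.

```python
from itertools import permutations

COLS = 4  # assumed width of the Fig. 1 array


def step(node: int, direction: str) -> int:
    """Return the neighbor reached from node Nn in the given direction."""
    offsets = {"+X": 1, "-X": -1, "+Y": -COLS, "-Y": COLS}
    return node + offsets[direction]


def all_paths(node: int, sx: str, dx: int, sy: str, dy: int) -> list[list[int]]:
    """List every node sequence that uses dx row hops and dy column hops."""
    paths = []
    # Each distinct ordering of the hops is one minimal-length path.
    for order in sorted(set(permutations([sx] * dx + [sy] * dy))):
        path = [node]
        for direction in order:
            path.append(step(path[-1], direction))
        paths.append(path)
    return paths


for path in all_paths(10, "+X", 1, "+Y", 2):
    print(" -> ".join(f"N{n}" for n in path))
```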

Each time a message enters a node, that node
examines the header to determine how the message should be
routed. If ΔX≠0 and ΔY=0, then the header is sent out of
the node on a row in either the +X direction or -X
direction as specified by the S_X field. If ΔX=0 and ΔY≠0,
then the header is sent out of the node on a column in
either the +Y direction or -Y direction as specified by the
S_Y field.

If ΔX≠0 and ΔY≠0, then the node makes a decision,
in accordance with a second feature of the present
invention, to send the header out of the node on a row in
a direction specified by the S_X field, or on a column in a
direction specified by the S_Y field. Exactly how this
decision is made will be described shortly in conjunction
with Figs. 3A, 3B, and 4.

Each time the header is sent from a node on a
row, the ΔX field is decremented by one. Similarly,
each time the header is sent from a node on a column, the
ΔY field is decremented by one. Thus, when a node receives
a header with ΔX=0 and ΔY=0, the message is for that node.
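
The examine-and-decrement cycle described above can be sketched as a single function. This is a simplified illustration: the `prefer_row` argument is a hypothetical stand-in for the decision made when both fields are nonzero, and the routing limitations of Fig. 4 are not modeled here.

```python
def route_step(sx, dx, sy, dy, prefer_row=True):
    """Return (output, dx, dy) for one pass of a header through a node.

    output is "+X", "-X", "+Y", "-Y", or "LOCAL" when the message has arrived.
    prefer_row is a hypothetical tie-breaker for the case dx != 0 and dy != 0.
    """
    if dx == 0 and dy == 0:
        return "LOCAL", dx, dy        # the message is for this node
    if dy == 0 or (dx != 0 and prefer_row):
        return sx, dx - 1, dy         # sent on a row: decrement ΔX
    return sy, dx, dy - 1             # sent on a column: decrement ΔY


# Header of the N9-to-N3 example as it arrives at node N10.
sx, dx, sy, dy = "+X", 1, "+Y", 2
hops = []
while True:
    out, dx, dy = route_step(sx, dx, sy, dy)
    hops.append(out)
    if out == "LOCAL":
        break
print(hops)
```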

Considering now Figs. 3A, 3B, and 4, a preferred
process by which the header is sent out of the node on
either a row or column, when ΔX≠0 and ΔY≠0, will be
described. To understand this process, the concept of a
message routing deadlock must first be understood, and two
such deadlocks are illustrated in Figs. 3A and 3B.
Specifically, Fig. 3A illustrates a counterclockwise
routing deadlock; whereas Fig. 3B illustrates a clockwise
routing deadlock.

In Fig. 3A, node N6 has a message to send to node
N11, and a route for that message is indicated as route R6.
Similarly in Fig. 3A, node N7 has a message to send to node
N10, and a route for that message is indicated as route R7;
node N10 has a message to send to node N7, and a route for
that message is indicated as route R10; and node N11 has a
message to send to node N6, and a route for that message is
indicated as route R11.

Each of the above routes R6, R7, R10 and R11 is
shown partly with a solid line and partly with a dashed
line. These solid lines illustrate where the messages have
traveled, and these dashed lines illustrate where the
messages still need to travel to reach their destination.
For example, the solid line in route R6 indicates that the
message from node N6 has traveled to node N10; and the
dashed line in route R6 indicates that the same message
still needs to travel from node N10 to node N11.
Inspection of all of the Fig. 3A message routes shows
that no message is able to reach its final destination.
That is because the Fig. 3A message routes form a loop
wherein one part of each route is blocked by one part of
another route. For example, the message on route R6
cannot travel from node N10 to node N11 because the message on
route R10 is using the channel between nodes N10 and N11 in
the +X direction.
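
The loop of Fig. 3A can be expressed as a wait-for relationship among the four routes. The sketch below is a reconstruction from the description rather than from the figure itself. The entries for R6 and R10 follow the text directly, while the entries for R7 and R11 are assumed from the statement that all four routes together form a counterclockwise loop. The function name `find_deadlock` is hypothetical.

```python
# Each entry: route -> (channel held, channel still needed).
# A one-way channel is written as (from_node, to_node).
fig_3a = {
    "R6":  ((6, 10), (10, 11)),   # went -Y, now needs +X
    "R10": ((10, 11), (11, 7)),   # went +X, now needs +Y
    "R11": ((11, 7), (7, 6)),     # went +Y, now needs -X (assumed)
    "R7":  ((7, 6), (6, 10)),     # went -X, now needs -Y (assumed)
}


def find_deadlock(routes):
    """Return a list of routes that wait on each other in a loop, or []."""
    holder = {held: name for name, (held, _) in routes.items()}
    for start in routes:
        seen = []
        current = start
        while current is not None and current not in seen:
            seen.append(current)
            needed = routes[current][1]
            current = holder.get(needed)  # which route holds the needed channel?
        if current is not None:
            return seen[seen.index(current):]
    return []


print(find_deadlock(fig_3a))

# If the message from N10 instead goes first in the +Y direction, it holds
# the channel from N10 to N6 and the loop no longer closes.
rerouted = dict(fig_3a, R10=((10, 6), (6, 7)))
print(find_deadlock(rerouted))
```

The second call illustrates why forcing one of the four turns out of the loop is sufficient to break it.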

Similarly in Fig. 3B, node N6 has a message to
send to node N11, and a route for that message is indicated
as route R6'; node N7 has a message to send to node N10,
and a route for that message is indicated as route R7';
node N10 has a message to send to node N7, and a route for
that message is indicated as route R10'; and node N11 has
a message to send to node N6, and a route for that message
is indicated as route R11'.

Here again, the above routes R6', R7', R10' and
R11' are shown partly with a solid line which illustrates
where the messages have traveled, and partly with a dashed
line which illustrates where the messages still need to
travel to reach their destination. Inspection of all of the
Fig. 3B message routes shows that no message is able to
reach its final destination because those message routes
form a loop wherein one part of each route is blocked by
one part of another route. For example, the message on
route R6' cannot travel from node N7 to node N11 because
the message on route R7' is using the channel between nodes
N7 and N11 in the -Y direction.

Now, in accordance with the present invention,
the above described message routing deadlocks of Figs. 3A
and 3B are prevented by imposing routing limitations which
are given in Fig. 4. Each of those Fig. 4 limitations
applies only when a node receives a header where ΔX≠0 and
ΔY≠0.

To avoid the message routing deadlocks of Figs.
3A and 3B, one pair of routing limitations in Fig. 4 must
be imposed by each of the data processing nodes N1-N16.
Each node N1-N16 must impose the same pair of routing
limitations; and that pair can be pair #1, or pair #2, ...
or pair #12 as listed below in Table 2.

TABLE 2

pair #1  ... limitations 30 and 31a
pair #2  ... limitations 30 and 31b
pair #3  ... limitations 30 and 31c
pair #4  ... limitations 32 and 33a
pair #5  ... limitations 32 and 33b
pair #6  ... limitations 32 and 33c
pair #7  ... limitations 34 and 35a
pair #8  ... limitations 34 and 35b
pair #9  ... limitations 34 and 35c
pair #10 ... limitations 36 and 37a
pair #11 ... limitations 36 and 37b
pair #12 ... limitations 36 and 37c

According to limitation 30, if a message has a
header where S_X selects a +X direction and S_Y selects a +Y
direction when ΔX≠0 and ΔY≠0, then that message should be
routed from each node which receives that header in the +Y
direction. A message with such a header is initiated in
Fig. 3A from the node N10. In Fig. 3A, however, the route
R10 for the message from node N10 goes first in the +X
direction and then in the +Y direction. By changing the
route R10 such that the message from node N10 goes first in
the +Y direction and then in the +X direction, the routing
loop of Fig. 3A is broken; and that in turn eliminates the
Fig. 3A counterclockwise routing deadlock.

Recall that each time a message exits a node in
the +Y direction, the ΔY field in the header is decremented
by one. Thus, the ΔY field will eventually go to zero.
When ΔX≠0 and ΔY=0, limitation 30 will no longer apply and
each node will route the message along a row in a direction
selected by S_X.

Even when the routing limitation 30 in Fig. 4 is
imposed, the clockwise routing deadlock of Fig. 3B can
still occur. This is seen from Fig. 3B wherein the message
which is initiated from the node N10 travels along the
route R10' that goes first in the +Y direction and then in
the +X direction. Consequently, to eliminate the clockwise
routing deadlock of Fig. 3B, limitation 30 needs to be
imposed together with a second limitation; and in Fig. 4,
three alternative pairs of routing limitations are given as
pairs #1, #2, and #3.

Limitation 31a of pair #1 applies to messages
which have a header where S_X selects a +X direction and S_Y
selects a -Y direction when ΔX≠0 and ΔY≠0. In that case,
limitation 31a causes each node which receives such a
header to route the message in a -Y direction. For
example, in Fig. 3B, the message which is initiated by node
N6 along the route R6' must have a header which specifies
the +X direction and the -Y direction for the message to
travel; however, the message route R6' goes in the +X
direction first and then in the -Y direction. Thus by
imposing the limitation 31a, the message initiated by node
N6 would travel first in the -Y direction to node N10 and
then in the +X direction to node N11; and that would
prevent the clockwise routing loop of Fig. 3B from
occurring.

Similarly, limitation 31b of pair #2 also
prevents the clockwise routing loop of Fig. 3B from
occurring. According to limitation 31b, if a message has
a header where S_X specifies a -X direction and S_Y specifies
a -Y direction when ΔX≠0 and ΔY≠0, then that message should
be sent from each node which receives the header in the -X
direction. In Fig. 3B, the message which is initiated by
node N7 must contain a header which specifies a -X
direction and a -Y direction for the message to travel
since the route R7' goes in both of those directions.
However, the route R7' goes first in the -Y direction; and
by changing the route R7' such that it goes first in the -X
direction, the clockwise routing deadlock loop of Fig. 3B
will not occur.

Likewise, the limitation 31c of pair #3 prevents
a clockwise routing loop from occurring by limiting routes
for messages with headers that specify a -X direction and
a +Y direction when ΔX≠0 and ΔY≠0. Those messages,
according to the limitation 31c, must travel first in the
+Y direction. Such a message is initiated in Fig. 3B from
the node N11 since the route R11' goes in both the -X
and +Y directions. However, the route R11' goes first in
the -X direction; and by changing the route R11' such that
it goes first in the +Y direction, the clockwise routing
loop of Fig. 3B is broken.

Except for the above limitations 30 and 31a, or
30 and 31b, or 30 and 31c, a message which has a header
with ΔX≠0 and ΔY≠0 can be routed in either of the directions
selected by the fields S_X and S_Y. For example, if the
routing limitations 30 and 31a are imposed, then a message
which has a header where ΔX≠0 and ΔY≠0 and S_X selects a -X
direction and S_Y selects a +Y direction can be routed from
a node in either the -X or +Y direction. If the channel
which carries messages from the node in the -X direction is
busy carrying another message, then the channel which
carries messages from the node in the +Y direction can be
used if it is not busy; and vice versa.
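
As an illustration of pair #1, the following sketch lists the output directions that a node may use for a given header when limitations 30 and 31a are imposed. The function name `allowed_outputs` is hypothetical. The sketch models only the direction rule and not channel availability.

```python
def allowed_outputs(sx, dx, sy, dy):
    """Return the output directions a node may use under limitation pair #1."""
    if dx == 0 and dy == 0:
        return []              # the message is for this node
    if dy == 0:
        return [sx]            # only row hops remain
    if dx == 0:
        return [sy]            # only column hops remain
    if (sx, sy) == ("+X", "+Y"):
        return ["+Y"]          # limitation 30
    if (sx, sy) == ("+X", "-Y"):
        return ["-Y"]          # limitation 31a
    return [sx, sy]            # remaining two combinations: either direction


print(allowed_outputs("+X", 1, "+Y", 2))
print(allowed_outputs("-X", 1, "+Y", 1))
```

With this pair, both limitations happen to apply to headers where S_X selects the +X direction, so only messages that travel in the -X direction keep a choice of two outputs.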

Consider now the remaining routing limitation
pairs (i.e., pairs #4 through #12) of Table 2. There, each of
the limitations 32, 34, and 36 prevents the
counterclockwise deadlock of Fig. 3A from occurring;
whereas each of the limitations 33a, 33b, 33c, 35a, 35b,
35c, 37a, 37b, and 37c prevents the clockwise deadlock.
For example, the limitation 32 applies to messages with
headers where S_X selects a -X direction and S_Y selects a +Y
direction when ΔX≠0 and ΔY≠0. Such a message, according to
the limitation 32, must be passed in the -X direction from
each node which receives the header. This limitation will
prevent a message from taking the route R11 in Fig. 3A, and
thereby prevent a counterclockwise loop.

Likewise, the limitation 33a applies to messages
with headers where S_X selects a +X direction and S_Y selects
a +Y direction when ΔX≠0 and ΔY≠0. Such a message,
according to the limitation 33a, must be passed in the +X
direction from each node which receives the header. This
limitation will prevent a message from taking the route
R10' in Fig. 3B, and thereby prevent a clockwise loop.

In a generic sense, the limitations 30, 32, 34
and 36 can be restated as two process steps (1 and 2) which
each node must perform in routing a message. Step 1 is to
detect if ΔX≠0 and ΔY≠0 and the directions selected by S_X
and S_Y equal a first predetermined pair of directions. Step 2
is to send the message through the node, when the step 1
detection occurs, in the one direction of the first
predetermined pair which, when followed by the other
direction of the first predetermined pair, forms a clockwise
turn.

For example, with the limitation 36, the first
predetermined pair of directions is the +X direction and
the -Y direction. In that case, a message which travels in
the +X direction followed by the -Y direction makes a
clockwise turn, whereas a message which travels in the -Y
direction followed by the +X direction makes a
counterclockwise turn. Thus, the one direction of the
first pair is the +X direction.

Likewise in a generic sense, the limitations
31a-31c, 33a-33c, 35a-35c, and 37a-37c can be restated as
two other process steps (3 and 4) which each node must
perform in routing a message. Step 3 is to detect if ΔX≠0
and ΔY≠0 and the directions selected by S_X and S_Y equal a
second predetermined pair of directions. Step 4 is to send the
message through the node, when the step 3 detection occurs,
in the one direction of the second predetermined pair which,
when followed by the other direction of the second
predetermined pair, forms a counterclockwise turn.

For example, with the limitation 37a, the second
predetermined pair of directions is the -X and the -Y
direction. In that case, a message which travels in the -X
direction followed by the -Y direction makes a
counterclockwise turn, whereas a message which travels in
the -Y direction followed by the -X direction makes a
clockwise turn. Thus, the one direction of the second pair
is the -X direction.
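
The clockwise and counterclockwise classification used in steps 2 and 4 can be checked with the sign of a two-dimensional cross product. The sketch below is one possible formulation, assuming +X points right and +Y points up as in the examples above. The names `turn` and `forced_direction` are hypothetical.

```python
# Unit vectors for the four directions, with +X to the right and +Y upward.
VECTORS = {"+X": (1, 0), "-X": (-1, 0), "+Y": (0, 1), "-Y": (0, -1)}


def turn(first, second):
    """Classify the turn made by traveling `first` and then `second`."""
    (ax, ay), (bx, by) = VECTORS[first], VECTORS[second]
    cross = ax * by - ay * bx
    if cross == 0:
        raise ValueError("directions must lie on different axes")
    return "counterclockwise" if cross > 0 else "clockwise"


def forced_direction(sx, sy, wanted_turn):
    """Pick the member of (sx, sy) that, followed by the other, makes wanted_turn."""
    return sx if turn(sx, sy) == wanted_turn else sy


print(turn("+X", "-Y"))                                   # limitation 36 example
print(forced_direction("+X", "-Y", "clockwise"))          # steps 1 and 2
print(forced_direction("-X", "-Y", "counterclockwise"))   # steps 3 and 4
```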

Suppose now that a node receives a message with
a header where ΔX≠0 and ΔY≠0 and the directions selected by
S_X and S_Y equal neither the first nor the second predetermined
pair of directions. In that case, the message is sent through
the node in either one of the directions selected by S_X and S_Y
based on channel availability and without regard to whether
the message will make a clockwise turn or counterclockwise
turn.

Turning next to Figs. 5 and 6, a preferred
circuit for each of the nodes, by which they each perform
the above message routing process, will be described. In
Fig. 5, all of the circuitry 50 which is there shown
constitutes a single node. Thus, to build the previously
described sixteen node array 10 of Fig. 1, the circuit 50
is replicated sixteen times.

Included within the circuit 50 are five one-way
input channels; and they are labeled +XI, -XI, +YI, -YI,
and LI. Likewise included within the circuit 50 are five
one-way output channels; and they are labeled +XO, -XO,
+YO, -YO, and LO. A single one-way input channel plus a
single one-way output channel corresponds to a single
full-duplex channel in the Fig. 1 array.

Specifically, the correlation between the one-way
channels in the circuit 50 and the full-duplex channels of
Fig. 1 is as follows:
-XO and +XI = full-duplex channel on left of node,
+XO and -XI = full-duplex channel on right of node,
-YO and +YI = full-duplex channel on bottom of node,
+YO and -YI = full-duplex channel on top of node,
LO and LI = full-duplex channel internal to node.

For example, suppose the circuit 50 is used as
node N6 in the Fig. 1 array 10. In that case, the channels
correlate as follows:

-XO and +XI = CH(5,6)
+XO and -XI = CH(6,7)
-YO and +YI = CH(6,10)
+YO and -YI = CH(2,6)

Likewise, suppose the circuit 50 is used as node
N13 in the Fig. 1 array. In that case, the channels
correlate as follows:

channels -XO and +XI are not used
+XO and -XI = CH(13,14)
channels -YO and +YI are not used
+YO and -YI = CH(9,13)
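
The two correlations above follow a pattern that can be computed for any node. The sketch below assumes the four-by-four numbering of Fig. 1 and names each full-duplex channel with the lower node number first, as in Table 1. The helper names are hypothetical, and `None` marks a channel pair that is not used at an edge of the array.

```python
COLS, ROWS = 4, 4  # assumed dimensions of the Fig. 1 array


def channel(a, b):
    """Name the full-duplex channel between two nodes, lower number first."""
    return f"CH({min(a, b)},{max(a, b)})"


def node_channels(node):
    """Map each one-way channel pair of circuit 50 to a Fig. 1 channel or None."""
    col, row = (node - 1) % COLS, (node - 1) // COLS
    return {
        "-XO/+XI": channel(node, node - 1) if col > 0 else None,            # left
        "+XO/-XI": channel(node, node + 1) if col < COLS - 1 else None,     # right
        "-YO/+YI": channel(node, node + COLS) if row < ROWS - 1 else None,  # bottom
        "+YO/-YI": channel(node, node - COLS) if row > 0 else None,         # top
    }


print(node_channels(6))
print(node_channels(13))
```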
Also included in the circuit 50 is a local data
processing module 51 which receives messages from the
one-way channel LO and which sends messages on the one-way
channel LI. This data processing module 51 preferably
includes a microprocessor integrated circuit chip and other
supporting chips, such as a memory, which enable the
microprocessor chip to receive, process, and send messages.

Further included in the circuit 50 are five input
buffers 52-1, 52-2, 52-3, 52-4 and 52-5 and a five-by-five
crossbar switch 53. Each of the input buffers 52-1 through
52-5 is large enough to store the header portion 21 of one
message. These buffers 52-1, 52-2, 52-3, 52-4 and 52-5
respectively pass messages from the one-way input channels
+XI, -XI, +YI, -YI and LI to the crossbar 53. Then the
crossbar 53 passes the messages to respective one-way
output channels +XO, -XO, +YO, -YO, and LO.

In order to control which buffers 52-1 through 52-5
and which one-way output channels get intercoupled, the
circuit 50 also includes a control module 54. One
respective set A, B, C, D, E of control signals is sent
between the control module 54 and each of the input buffers
52-1, 52-2, 52-3, 52-4 and 52-5; and another set F of
control signals is sent between the control module 54 and
the crossbar 53. These signal sets A through F are shown in
detail in Fig. 6 together with a preferred internal
structure for the control module 54.

Inspection of Fig. 6 shows that each of the input
buffers sends a signal HDRCVD to an arbiter circuit 54-1.
When buffer 52-1 receives the header of a message, the
HDRCVD signal in the signal set A goes true. Likewise,
when buffer 52-2 receives the header of a message, the
HDRCVD signal in the signal set B goes true; etc.

Within the arbiter 54-1, the true HDRCVD signals
are selected one at a time. If the HDRCVD signal from
buffer 52-1 is selected, then the arbiter 54-1 generates an
output signal of SERVICE=1. Likewise, if the HDRCVD signal
from buffer 52-2 is selected, then the arbiter generates an
output signal of SERVICE=2, etc. These service signals are
indicated in Fig. 6 as SERVICE=N.

From the arbiter 54-1, the SERVICE=N signals are
sent to a multiplexor 54-2. Then, in response to the
SERVICE=1 signal, the multiplexor passes four signals S_X,
ZΔX, S_Y, ZΔY from buffer 52-1 to the multiplexor output
54-2a. Likewise, in response to the SERVICE=2 signal, the
multiplexor passes four signals S_X, ZΔX, S_Y, ZΔY from
buffer 52-2 to the multiplexor output 54-2a, etc.

Signal S_X is true if the header in the selected
input buffer specifies a +X direction for the message; and
signal ZΔX is true if the ΔX field has zero magnitude.
Likewise, signal S_Y is true if the header in the selected
input buffer specifies a +Y direction for the message; and
signal ZΔY is true if the ΔY field has zero magnitude.

All of the signals from the multiplexor output
54-2a are sent to a state machine 54-3. Also, the state
machine 54-3 receives the SERVICE=N signals from the
arbiter 54-1, and it receives five other signals as part of
the signal set F from the crossbar 53. These five signals
are +XOBUSY, -XOBUSY, +YOBUSY, -YOBUSY, and LOBUSY.

A true +XOBUSY signal indicates that the one-way
output channel +XO is busy carrying a message from one of
the input buffers, and thus it is not available to carry
another message from a different input buffer. Likewise,
a true -XOBUSY signal indicates that the one-way output
channel -XO is busy carrying a message from one of the
input buffers, and thus it is not available to carry
another message from a different buffer; etc.

Based on all of the signals which the state
machine 54-3 receives, the state machine generates five
sets of commands to the crossbar 53; and those five command
sets are shown in Fig. 6 as +XOCMD, -XOCMD, +YOCMD, -YOCMD,
and LOCMD. In response to the +XOCMD, the crossbar 53
couples the output of one of the buffers 52-1 through 52-5 to
the +XO one-way output channel. Likewise, in response to
the -XOCMD, the crossbar 53 couples the output of one of
the buffers to the -XO one-way channel; etc.

Each of the above five command sets is generated
by the state machine 54-3 in accordance with one of the
previously described pairs of message routing limitations
as given in Fig. 4. For example, consider the case where
the pair of routing limitations 30 and 31a is imposed; and
further in that case, assume S_X=true, ZΔX=false, S_Y=true,
ZΔY=false, and +YOBUSY=false. Then in that case, the state
machine 54-3 will generate a +YOCMD which directs the
crossbar to pass the output of buffer 52-1 to the +YO
output channel. If, however, +YOBUSY=true, then no new
+YOCMD is generated.

Likewise, assume the routing limitations 30 and
31a are again imposed; and further assume S_X=true,
ZΔX=false, S_Y=false, ZΔY=false, and -YOBUSY=false. Then in
that case, the state machine 54-3 will generate a -YOCMD which
directs the crossbar to pass the output of buffer 52-1 to
the -YO output channel. Here again, if -YOBUSY=true, then
no -YOCMD is generated.

Similarly, assume the routing limitations 30 and
31a are again imposed; and further assume S_X=false,
ZΔX=false, S_Y=true, ZΔY=false, -XOBUSY=false, and
+YOBUSY=false. Then in that case, the state machine 54-3
will generate either a -XOCMD or a +YOCMD which
respectively direct the crossbar to pass the output of
buffer 52-1 to the -XO or +YO output channel.

In the above case where the state machine 54-3
has to choose one of two commands to generate, that choice
in a first embodiment is made in a random fashion. In a
second embodiment, the choice is made on an alternating
basis. Further, in a third embodiment, the choice is made
by pre-assigning priorities to the output channels.
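
The three examples above, together with the three tie-break embodiments, can be sketched in Python as follows. This is only an illustrative model under stated assumptions, not the Fig. 6 circuit. The names Header, permitted_channels, choose_command, alternating, by_priority, and PRIORITY are hypothetical. A true Sx or Sy is taken to select the + direction, as the examples suggest. The handling of a zero ΔX or ΔY, and of the case where only one of two permitted channels is busy, is assumed rather than taken from the passage above.

```python
import random
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Header:
    sx: bool  # Sx: True is assumed to select +X, False to select -X
    dx: int   # ΔX: nodes still to pass in the X direction
    sy: bool  # Sy: True is assumed to select +Y, False to select -Y
    dy: int   # ΔY: nodes still to pass in the Y direction


def permitted_channels(h):
    """Output channels allowed when routing limitations 30 and 31a are imposed."""
    x = "+X" if h.sx else "-X"
    y = "+Y" if h.sy else "-Y"
    if h.dx == 0 and h.dy == 0:
        return ["L"]   # assumption: nothing left to travel, deliver locally
    if h.dx == 0:
        return [y]     # assumption: only Y travel remains
    if h.dy == 0:
        return [x]     # assumption: only X travel remains
    if h.sx:
        return [y]     # limitations 30 (+X+Y) and 31a (+X-Y): Y channel only
    return [x, y]      # otherwise either selected direction may be used


def choose_command(h, busy, pick=random.choice):
    """Return the chosen channel, or None when no command is generated.

    The default pick models the first embodiment (random choice).
    """
    free = [c for c in permitted_channels(h) if not busy.get(c, False)]
    if not free:
        return None
    return free[0] if len(free) == 1 else pick(free)


def alternating():
    """Second embodiment: alternate between the two free channels."""
    turn = cycle([0, 1])
    return lambda free: free[next(turn)]


# Third embodiment: hypothetical pre-assigned priority order, highest first.
PRIORITY = ["+X", "-X", "+Y", "-Y"]


def by_priority(free):
    return min(free, key=PRIORITY.index)


print(choose_command(Header(True, 1, True, 1), {}))                # +Y
print(choose_command(Header(True, 1, True, 1), {"+Y": True}))      # None
print(choose_command(Header(False, 1, True, 1), {}, by_priority))  # -X
```

Passing the tie-break as a function keeps the forced cases (limitations 30 and 31a) separate from the free-choice case, so the three embodiments differ only in the pick argument.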

If the state machine 54-3 generates one of the
commands +XOCMD or -XOCMD, that state machine then sends a
MODΔX pulse to a demultiplexer 54-4. At the same time, the
demultiplexer receives the SERVICE=N signals from the
arbiter 54-1; and in response, the demultiplexer passes the
MODΔX pulse to the particular input buffer which the
SERVICE=N signals select. For example, if SERVICE=1 is
true, the MODΔX pulse is sent to buffer 52-1. Then in the
input buffer which receives the MODΔX pulse, the ΔX field
is decremented by one.

Similarly, if the state machine 54-3 generates
one of the commands +YOCMD or -YOCMD, that state machine
then sends a MODΔY pulse to the demultiplexer 54-4. In
turn, the MODΔY pulse is passed by the demultiplexer to the
particular input buffer which the SERVICE=N signals select;
and that input buffer then decrements the ΔY field by one.
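
As a rough illustration of the pulse steering just described, the sketch below models the demultiplexer as a lookup keyed by the SERVICE=N value. The names InputBuffer and apply_mod_pulse are hypothetical, the buffer contents are made-up values, and treating LOCMD as producing no pulse is an assumption.

```python
from dataclasses import dataclass


@dataclass
class InputBuffer:
    dx: int  # ΔX field of the header held in this buffer
    dy: int  # ΔY field of the header held in this buffer


def apply_mod_pulse(buffers, service_n, command):
    """Decrement ΔX or ΔY in the one buffer that SERVICE=N selects."""
    buf = buffers[service_n]
    if command in ("+XOCMD", "-XOCMD"):
        buf.dx -= 1  # MODΔX pulse
    elif command in ("+YOCMD", "-YOCMD"):
        buf.dy -= 1  # MODΔY pulse
    # Assumption: LOCMD sends no pulse, so both fields are left unchanged.
    return buf


# Hypothetical contents for buffers 52-1 thru 52-5, keyed by N.
buffers = {n: InputBuffer(dx=2, dy=2) for n in range(1, 6)}
apply_mod_pulse(buffers, 1, "+YOCMD")
print(buffers[1])  # InputBuffer(dx=2, dy=1)
print(buffers[2])  # InputBuffer(dx=2, dy=2)
```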

Lastly, the state machine 54-3 sends either an
END signal or a TRYAG signal to the arbiter 54-1. If the
state machine 54-3 selected an output channel from the
crossbar 53 which was not busy, then the END signal is
sent; otherwise the TRYAG signal is sent. In response to
either the END signal or the TRYAG signal, the arbiter 54-1
reselects one of the true HDRRCVD signals and all of the
above described operations by the Fig. 6 circuit are
repeated. However, if the END signal was sent, the arbiter
54-1 disregards the HDRRCVD signal that was last selected
until it switches from a false state to a true state, which
indicates that a new header has been received.
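
The END and TRYAG behavior of the arbiter can be modeled loosely as below. This is a sketch only: the class and method names are hypothetical, inputs are indexed from 0, and the round-robin order in which true HDRRCVD signals are reselected is an assumption, since the passage above does not state a selection order.

```python
class Arbiter:
    def __init__(self, n_inputs=5):
        self.hdrrcvd = [False] * n_inputs
        # True after END, until that HDRRCVD signal goes from false to true.
        self.disregarded = [False] * n_inputs
        self.last = -1  # index of the input that was last selected

    def set_hdrrcvd(self, n, value):
        if value and not self.hdrrcvd[n]:
            self.disregarded[n] = False  # rising edge: a new header arrived
        self.hdrrcvd[n] = value

    def select(self):
        """Pick the next true, non-disregarded HDRRCVD (assumed round-robin)."""
        count = len(self.hdrrcvd)
        for k in range(1, count + 1):
            n = (self.last + k) % count
            if self.hdrrcvd[n] and not self.disregarded[n]:
                self.last = n
                return n
        return None

    def end(self):
        """END: the header was sent, so disregard it until a new one arrives."""
        if self.last >= 0:
            self.disregarded[self.last] = True

    def tryag(self):
        """TRYAG: the channel was busy, so the header stays eligible."""


arb = Arbiter()
arb.set_hdrrcvd(0, True)
arb.set_hdrrcvd(2, True)
first = arb.select()   # input 0 is selected
arb.tryag()            # its channel was busy, so it remains eligible
second = arb.select()  # input 2 gets its turn
arb.end()              # input 2 was sent and is now disregarded
third = arb.select()   # back to input 0
print(first, second, third)  # 0 2 0
```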

From the above description of Figs. 5 and 6, it
should be evident that the data processing module 51 in any
one node can send a message to the data processing module
51 in any other node simply by loading the message with a
proper header into the input buffer 52-5. Note that when
this header is loaded into the buffer 52-5, the ΔX and ΔY
fields must account for the passage of the message from the
buffer 52-5 through the crossbar 53 to one of the output
channels +XO, -XO, +YO, or -YO.

For example, recall that in the description of
Fig. 2, a message was sent from node N9 to node N3, and
that message was described as leaving node N9 in the +X
direction with a header of Sx=+X, ΔX=1, Sy=+Y, ΔY=2.
However, within node N9, the header would be loaded into
buffer 52-5 by the data processor 51 with fields of Sx=+X,
ΔX=2, Sy=+Y, ΔY=2.

Thereafter the arbiter 54-1 will select the local
input bus LI for service; and then the state machine 54-3
will determine whether to send the header out of the
crossbar 53 in the +X direction or the +Y direction. This
choice will be made in accordance with one pair of message
routing limitations from Fig. 4. If the +Y direction is
selected, the message will leave node N9 with header fields
of Sx=+X, ΔX=2, Sy=+Y, ΔY=1.
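
The header arithmetic in this example can be checked with a short Python sketch. The function name leave_node and the tuple layout (Sx, ΔX, Sy, ΔY) are hypothetical, and the four-step route at the end is just one possible sequence of exits chosen for illustration, not a route stated in the text.

```python
def leave_node(header, direction):
    """Return the header as it leaves a node on the given output channel."""
    sx, dx, sy, dy = header
    if direction == sx and dx > 0:
        return (sx, dx - 1, sy, dy)  # exiting in X decrements ΔX
    if direction == sy and dy > 0:
        return (sx, dx, sy, dy - 1)  # exiting in Y decrements ΔY
    raise ValueError(f"{direction} is not a usable direction for {header}")


loaded = ("+X", 2, "+Y", 2)       # header as loaded into buffer 52-5 in node N9
print(leave_node(loaded, "+X"))   # ('+X', 1, '+Y', 2)
print(leave_node(loaded, "+Y"))   # ('+X', 2, '+Y', 1)

# One hypothetical sequence of exits that uses up both counts.
header = loaded
for step in ["+Y", "+Y", "+X", "+X"]:
    header = leave_node(header, step)
print(header)                     # ('+X', 0, '+Y', 0)
```

Both counts reach zero after four exits, which matches a header loaded with ΔX=2 and ΔY=2.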

One preferred method of routing messages through 
an array of data processing nodes, as well as one preferred 
structure for each node, has now been described in detail. 
In addition, however, various changes and modifications can 
be made to those details without departing from the nature
and spirit of the invention.

In particular in Fig. 5, the input buffers 52-1
thru 52-5 can be made of any type of flip-flops; the
crossbar 53 can be made of any type of logic gates; and any
type of microprocessor chip can be used for the data
processor 51. Likewise, in Fig. 6, any type of logic gates
and flip-flops can be used to construct the arbiter 54-1,
the multiplexer 54-2, the state machine 54-3, and the
demultiplexer 54-4.

Accordingly, it is to be understood that the invention
is not limited to the details of any one particular
preferred embodiment but is defined by the appended claims.

WHAT IS CLAIMED IS:

1. A method of routing a message through an array of
data processing nodes which are intercoupled with channels
in rows and columns; said message including a header with
an Sx field which selects a +X or -X direction for said
message to travel on said rows of channels, a ΔX field
which specifies the number of nodes through which said
message must pass in the direction selected by said Sx
field, an Sy field which selects a +Y or -Y direction for
said message to travel on said columns of channels, and a
ΔY field which specifies the number of nodes through which
said message must pass in the direction selected by said Sy
field; said method being performed by a node which receives
said header and including the steps of:

examining said header for a first state wherein
ΔX≠0 and ΔY≠0 and SxSy select a first predetermined pair of
directions;

sending said message through said node, if said
first state exists, in the one direction of said first pair
such that it followed by the other direction of said first
pair form a clockwise turn;

examining said header for a second state wherein
ΔX≠0 and ΔY≠0 and SxSy select a second predetermined pair of
directions;

sending said message through said node, if said
second state exists, in the one direction of said second
pair such that it followed by the other direction of said
second pair form a counterclockwise turn;

examining said header for a third state wherein
ΔX≠0 and ΔY≠0 and SxSy select neither said first or second
predetermined pairs of directions; and,

sending said message through said node, if said
third state exists, in either one of the directions
selected by SxSy based on channel availability and without
regard to whether said message will make a clockwise turn
or counterclockwise turn.




2. A method according to claim 1 wherein said first
predetermined pair of directions is +X+Y and said second
predetermined pair of directions is +X-Y.

3. A method according to claim 1 wherein said first
predetermined pair of directions is +X+Y and said second
predetermined pair of directions is -X-Y.

4. A method according to claim 1 wherein said first
predetermined pair of directions is +X+Y and said second
predetermined pair of directions is -X+Y.

5. A method according to claim 1 wherein said first
predetermined pair of directions is -X+Y and said second
predetermined pair of directions is +X+Y.

6. A method according to claim 1 wherein said first
predetermined pair of directions is -X+Y and said second
predetermined pair of directions is +X-Y.

7. A method according to claim 1 wherein said first
predetermined pair of directions is -X+Y and said second
predetermined pair of directions is -X-Y.

8. A method according to claim 1 wherein said first
predetermined pair of directions is -X-Y and said second
predetermined pair of directions is -X+Y.

9. A method according to claim 1 wherein said first
predetermined pair of directions is -X-Y and said second
predetermined pair of directions is +X+Y.

10. A method according to claim 1 wherein said first
predetermined pair of directions is -X-Y and said second
predetermined pair of directions is +X-Y.

11. A method according to claim 1 wherein said first
predetermined pair of directions is +X-Y and said second
predetermined pair of directions is -X-Y.

12. A method according to claim 1 wherein said first
predetermined pair of directions is +X-Y and said second
predetermined pair of directions is -X+Y.

13. A method according to claim 1 wherein said first
predetermined pair of directions is +X-Y and said second
predetermined pair of directions is +X+Y.

14. A method according to claim 1 wherein, if said
third state exists and two channels are available for
sending said message in the directions selected by Sx and
Sy, then the one direction in which said message is sent
through said node is selected randomly.

15. A method according to claim 1 wherein, if said
third state exists and two channels are available for
sending said message in the directions selected by Sx and
Sy, then the one direction in which said message is sent
through said node is selected in an alternating fashion.

16. A method according to claim 1 wherein, if said
third state exists and two channels are available for
sending said message in the directions selected by Sx and
Sy, then the one direction in which said message is sent
through said node is selected based on pre-assigned
priorities.